Ellis County
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Kansas > Ellis County > Hays (0.04)
- Asia > China > Guangdong Province > Zhuhai (0.04)
- Africa > Nigeria (0.06)
- North America > United States > Washington (0.05)
- North America > United States > Texas (0.05)
- Media (1.00)
- Leisure & Entertainment > Sports (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Fresh-CL: Feature Realignment through Experts on Hypersphere in Continual Learning
Zhou, Zhongyi, Peng, Yaxin, Yi, Pin, Zhu, Minjie, Shen, Chaomin
Continual Learning enables models to learn and adapt to new tasks while retaining prior knowledge. Introducing new tasks, however, can naturally lead to feature entanglement across tasks, limiting the model's capability to distinguish between new domain data. In this work, we propose a method called Feature Realignment through Experts on hyperSpHere in Continual Learning (Fresh-CL). By leveraging predefined and fixed simplex equiangular tight frame (ETF) classifiers on a hypersphere, our model improves feature separation both within and across tasks. However, the projection to a simplex ETF shifts with new tasks, disrupting the structured feature representation of previous tasks and degrading performance. Therefore, we propose a dynamic extension of ETF through a mixture of experts (MoE), enabling adaptive projections onto diverse subspaces to enhance feature representation. Experiments on 11 datasets demonstrate a 2% improvement in accuracy compared to the strongest baseline, particularly in fine-grained datasets, confirming the efficacy of combining ETF and MoE to improve feature distinction in continual learning scenarios.
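To make the "simplex ETF on a hypersphere" idea concrete, the sketch below builds K fixed unit-norm prototypes whose pairwise cosine similarity is -1/(K-1), then classifies a feature by its nearest prototype. It is only an illustration of the standard simplex ETF construction, assuming the identity matrix as the rotation; the function names are hypothetical, and the mixture-of-experts extension described in the abstract is not reproduced here.

```python
import math


def dot(u, v):
    """Standard Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def simplex_etf(num_classes):
    """Return K unit vectors in R^K with pairwise cosine -1/(K-1).

    Uses M = sqrt(K/(K-1)) * (I - (1/K) * ones), with the identity as the
    rotation (an assumption made for simplicity).
    """
    k = num_classes
    if k < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def classify(feature, prototypes):
    """Project the feature onto the unit sphere and pick the closest prototype."""
    norm = math.sqrt(dot(feature, feature))
    unit = [x / norm for x in feature]
    scores = [dot(unit, p) for p in prototypes]
    return max(range(len(scores)), key=scores.__getitem__)


prototypes = simplex_etf(4)
print(dot(prototypes[0], prototypes[1]))  # pairwise cosine between two classes
print(classify([0.1, 0.2, 3.0, 0.1], prototypes))
```

Because the prototypes are fixed in advance, only the feature extractor has to adapt when a new task arrives, which is the property the abstract builds on.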
- Asia > China > Shanghai > Shanghai (0.05)
- North America > United States > Kansas > Ellis County > Hays (0.04)
- Asia > Middle East > Jordan (0.04)
Improved prediction of future user activity in online A/B testing
Masoero, Lorenzo, Beraha, Mario, Richardson, Thomas, Favaro, Stefano
In online randomized experiments or A/B tests, accurate predictions of participant inclusion rates are of paramount importance. These predictions not only guide experimenters in optimizing the experiment's duration but also enhance the precision of treatment effect estimates. In this paper we present a novel, straightforward, and scalable Bayesian nonparametric approach for predicting the rate at which individuals will be exposed to interventions within the realm of online A/B testing. Our approach stands out by offering dual prediction capabilities: it forecasts both the quantity of new customers expected in future time windows and, unlike available alternative methods, the number of times they will be observed. We derive closed-form expressions for the posterior distributions of the quantities needed to form predictions about future user activity, thereby bypassing the need for numerical algorithms such as Markov chain Monte Carlo. After a comprehensive exposition of our model, we test its performance on experiments on real and simulated data, where we show its superior performance with respect to existing alternatives in the literature.
1 Introduction
The problem of predicting the size of a population from which random samples are drawn has a long history in the statistics literature. Originally motivated by applications in ecology, where the goal is typically to determine the number of distinct species of animals within a population (Fisher et al., 1943; Good, 1953; Burnham and Overton, 1979), a variation of this problem has recently received considerable attention also in the genomics literature, where scientists are interested in predicting the number of future rare variants to be observed within a genomic study (Ionita-Laza et al., 2009; Zou et al., 2016; Chakraborty et al., 2019; Masoero et al., 2022).
- North America > United States > Kansas > Ellis County (0.04)
- Asia > Middle East > Jordan (0.04)
- Research Report > Strength High (1.00)
- Research Report > Experimental Study (1.00)
Boosting Out-of-Distribution Detection with Multiple Pre-trained Models
Xue, Feng, He, Zi, Xie, Chuanlong, Tan, Falong, Li, Zhenguo
Out-of-Distribution (OOD) detection, i.e., identifying whether an input is sampled from a novel distribution other than the training distribution, is a critical task for safely deploying machine learning systems in the open world. Recently, post hoc detection utilizing pre-trained models has shown promising performance and can be scaled to large-scale problems. This advance raises a natural question: Can we leverage the diversity of multiple pre-trained models to improve the performance of post hoc detection methods? In this work, we propose a detection enhancement method by ensembling multiple detection decisions derived from a zoo of pre-trained models. Our approach uses the p-value instead of the commonly used hard threshold and leverages a fundamental framework of multiple hypothesis testing to control the true positive rate of In-Distribution (ID) data. We focus on the usage of model zoos and provide systematic empirical comparisons with current state-of-the-art methods on various OOD detection benchmarks. The proposed ensemble scheme shows consistent improvement compared to single-model detectors and significantly outperforms the current competitive methods. Our method substantially improves the relative performance by 65.40% and 26.96% on the CIFAR10 and ImageNet benchmarks.
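The p-value ensembling described above can be sketched as follows. Each model's detection score is turned into an empirical p-value against that model's scores on held-out in-distribution data, and the per-model p-values are combined with a multiple-testing rule. The abstract does not name the specific procedure, so this sketch assumes the Bonferroni correction as the simplest standard choice; the function names and the convention that higher scores mean "more in-distribution" are also assumptions.

```python
def empirical_p_value(score, id_validation_scores):
    """Empirical p-value of a score under the in-distribution (ID) hypothesis.

    Assumes higher scores mean "more ID-like", so a very low score yields a
    small p-value. The +1 terms keep the p-value strictly positive.
    """
    n = len(id_validation_scores)
    at_most = sum(1 for v in id_validation_scores if v <= score)
    return (1 + at_most) / (n + 1)


def is_ood_bonferroni(scores, validation_scores_per_model, alpha=0.05):
    """Flag an input as OOD if any model rejects the ID hypothesis.

    Bonferroni (assumed here, not taken from the abstract) compares each
    p-value with alpha / number_of_models, so ID inputs are wrongly flagged
    with probability at most about alpha.
    """
    p_values = [
        empirical_p_value(s, v)
        for s, v in zip(scores, validation_scores_per_model)
    ]
    return min(p_values) <= alpha / len(p_values)


# Two hypothetical models, each with 99 held-out ID validation scores.
validation = [list(range(1, 100)), list(range(1, 100))]
print(is_ood_bonferroni([50, 60], validation))  # typical ID-like scores
print(is_ood_bonferroni([0, 60], validation))   # one model sees an outlier
```

Working with p-values rather than a hard threshold per model puts scores from different models on a common scale, which is what makes combining them across a model zoo straightforward.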
- North America > United States > Kansas > Ellis County > Hays (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Beijing > Beijing (0.04)
- North America > United States > Connecticut > New Haven County > New Haven (0.14)
- North America > United States > Arkansas (0.04)
- North America > United States > New York (0.04)
- Media > News (1.00)
- Health & Medicine (1.00)
- Government > Regional Government > North America Government > United States Government (1.00)
A Linear Algebraic Approach to Model Parallelism in Deep Learning
Hewett, Russell J., Grady, Thomas J. II
Training deep neural networks (DNNs) in large-cluster computing environments is increasingly necessary, as networks grow in size and complexity. Local memory and processing limitations require robust data and model parallelism for crossing compute node boundaries. We propose a linear-algebraic approach to model parallelism in deep learning, which allows parallel distribution of any tensor in the DNN. Rather than rely on automatic differentiation tools, which do not universally support distributed memory parallelism models, we show that parallel data movement operations, e.g., broadcast, sum-reduce, and halo exchange, are linear operators, and by defining the relevant spaces and inner products, we manually develop the adjoint, or backward, operators required for gradient-based training of DNNs. We build distributed DNN layers using these parallel primitives, composed with sequential layer implementations, and demonstrate their application by building and training a distributed DNN using DistDL, a PyTorch and MPI-based distributed deep learning toolkit.
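The claim that data movement operations are linear operators with manually derivable adjoints can be checked on the simplest case: under the standard inner product, the adjoint of broadcast is sum-reduce. The sketch below simulates workers as plain Python lists rather than MPI ranks, and it does not use the DistDL API; all names are hypothetical.

```python
def broadcast(x, num_workers):
    """Forward operator B: copy the vector x to every worker."""
    return [list(x) for _ in range(num_workers)]


def sum_reduce(ys):
    """Adjoint operator B*: sum the workers' vectors elementwise."""
    return [sum(vals) for vals in zip(*ys)]


def inner(u, v):
    """Inner product on a single worker's vector space."""
    return sum(a * b for a, b in zip(u, v))


def inner_distributed(us, vs):
    """Inner product on the product space: sum over all workers."""
    return sum(inner(u, v) for u, v in zip(us, vs))


x = [1.0, 2.0]
y = [[3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]  # one vector per worker

# Adjoint test: <B x, y> must equal <x, B* y>.
lhs = inner_distributed(broadcast(x, 3), y)
rhs = inner(x, sum_reduce(y))
print(lhs, rhs)  # 51.0 51.0
```

In training terms, if the forward pass broadcasts a tensor to several workers, the backward pass must sum-reduce the gradients coming back from them, and passing this kind of adjoint test is a direct way to validate a hand-written backward operator.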
- North America > United States > Virginia > Montgomery County > Blacksburg (0.04)
- North America > United States > Texas > Dallas County > Dallas (0.04)
- North America > United States > Kansas > Ellis County (0.04)
Multi-Task Learning by a Top-Down Control Network
A general problem that has received considerable recent attention is how to perform multiple tasks in the same network, maximizing both prediction accuracy and efficiency of training. Recent approaches address this problem by branching networks, or by a channel-wise modulation of the feature maps with task-specific vectors. We propose a novel architecture that uses a top-down network to modify the main network according to the task in a channel-wise, as well as spatial-wise, image-dependent computation scheme. We show the effectiveness of our scheme by achieving better results than alternative state-of-the-art approaches to multi-task learning. We also demonstrate our advantages in terms of task selectivity, scaling the number of tasks, learning from fewer examples, and interpretability.
- North America > United States > Kansas > Ellis County (0.04)
- North America > United States > California (0.04)
- Asia > Middle East > Israel (0.04)